-
-
Notifications
You must be signed in to change notification settings - Fork 68
Multivariate Structural Statespace Components #529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Multivariate Structural Statespace Components #529
Conversation
This is cool! I will review ASAP. Note that #450 is currently blocked by what I think is a pytensor bug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is 🔥 @jessegrabowski 🤯
I just left a suggestion for what I think was a typo in the docstring. I'll merge once this is resolved, and then test all of this for our PyData tutorial -- probably this weekend.
Just a quick question: IIUC, now users can also have batched RegressionComponent
s, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is 🔥 @jessegrabowski 🤯
I just left a suggestion for what I think was a typo in the docstring.
Still missing this feature are:
-
Cycle
(currently worked on by @AlexAndorra) -
Seasonal
-
Regression
(currently worked on by @Dekermanjian)
We also need to:
- Make sure that there are tests that combined LevelTrend + AR + error for two observed variables with no interaction model matches two separate models for each, given the same parameters.
- Make sure that pytensor ops are used everywhere for building the SS matrices (no numpy/scipy)
I think I'm done for a first review from you on the |
2. Adjusted the regression component to allow multivariate regression component specification 3. Added a notebook for quick evaluation of the adjustments and additions made
2. replaced scipy block diag with pytensor block diag 3. Added forecast to test model in multivariate ssm notebook
Added multivariate regression-component
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexAndorra I left comments for you
Since it's my own PR I can't request changes. It's better in future if you fork the PR branch and open a new PR into this PR, then we can do the usual review workflow on your PR and merge it into this PR when we're ready
design_matrix = linalg.block_diag(*[Z for _ in range(self.k_endog)]) | ||
self.ssm["design", :, :] = pt.as_tensor_variable(design_matrix) | ||
|
||
R = np.eye(2) # 2x2 identity for each cycle component |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if innovations=False
, does R need to be adjusted in that case?
It was like this before your changes so don't worry about it, but it might need a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok? From what I understood (but speaking under your control):
R
defines the structure: which states can receive innovationsQ
defines the magnitude: how much innovation they receive- When
Q = 0
, the structure becomes irrelevant: no innovations occur
IIRC, the innovation variance affecting states is R @ Q @ R.T
, which is 0 when Q = 0
, regardless of R
's structure.
LMK if I'm way off here.
I made it clearer in the code, with some comments in make_symbolic_graph
_assert_basic_coords_correct(cycle) | ||
|
||
|
||
def test_cycle_multivariate_deterministic(rng): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test, eval the transition, design, and selection matrices and make sure they are what they are supposed to be (check the level_trend tests for an example)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, although my code might be too verbose and inefficient. LMK if I can improve it
assert_allclose(ratio_0, ratio_i, atol=1e-2, rtol=1e-2) | ||
|
||
|
||
def test_cycle_multivariate_with_innovations_and_cycle_length(rng): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, directly inspect the 3 relevant matrices (plus state_cov I suppose)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
@AlexAndorra @Dekermanjian I want the names of parameters in the components to be really consistent and unsurprising. So please vote on:
Concrete examples for (3): For (4), I'm talking about the internal state names that will end up as labels for the R and Q matrices, nothing else. Also for default names, since all the are going to depend on the names in the multivariate case, should we:
|
@AlexAndorra Also please add a test adding a cycle component to another cycle component with a different observed state name. Check the resulting matrices come out as expected. |
@Dekermanjian the regression component tests are failing because of this line: betas = self.make_and_register_variable(f"beta_{self.name}", shape=(k_endog, k_states)) You need to drop the betas = self.make_and_register_variable(f"beta_{self.name}", shape=(k_endog, k_states) if k_endog > 1 else (k_states, ) Also you have Finally, there are some unused computations -- you had assigned PS: Please add some multivariate regression tests (I think you already are, but just so it's on the record somewhere, here it is), including a test where you add together two regression components with different state names |
I personally prefer these:
for multivariate, I personally prefer going with option 2 or option 3 if it will also apply to the univariate case. I would prefer that univariate and multivariate are as consistent as would be possible. |
@Dekermanjian I went ahead and fixed up the regression component, so please make sure to pull before you keep working. The last steps before we merge this are to:
I think we can also take the rough notebook that @Dekermanjian started and turn it into a tutorial about multiple observed, in the same spirit as the existing one. We can make that a separate PR if we want, but if so, we should drop the notebook from this PR. @AlexAndorra I need your input on the naming questions I posed above, then we can make a final decision and go with it. Once that's settled, the |
Thanks for the incredibly fast progress @jessegrabowski !! Do you need a new review from me? Here are my votes:
For default names: |
# selection matrix R defines structure of innovations (always identity for cycle components) | ||
# when innovations=False, state cov Q=0, hence R @ Q @ R.T = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this to make the R logic clearer @jessegrabowski
else: | ||
# explicitly set state cov to 0 when no innovations | ||
self.ssm["state_cov", :, :] = pt.zeros((self.k_posdef, self.k_posdef)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: added this clause to make the logic clearer @jessegrabowski . LMK is superfluous
np.testing.assert_allclose(Q, expected_Q) | ||
|
||
|
||
def test_add_multivariate_cycle_components_with_different_observed(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jessegrabowski , you wrote:
Also please add a test adding a cycle component to another cycle component with a different observed state name. Check the resulting matrices come out as expected.
That's what I tried doing with this test, but LMK if that's overly complex, or even wrong, I wasn't super sure
Looks like we have consensus on To that end, let's also go with Regarding greek/descriptive, I'm really on the fence. I understand why @AlexAndorra is super anti-greek, and I've done implementations that specifically remove greek names (changing beta and gamma in BatchNorm to loc and scale here, for example). On the other hand, we do pay a cost when inventing our own names, unless they are already widely known/used in the specific literature. In the In actual fact, we already mix descriptive and greek names. In My suggestion is to punt on this and open a new issue to address package-wide naming conventions. For now, let's just make sure everything is of the form |
Ha ha, literally the reasoning behind my choice.
Totally agree, it's just that the way I see it, the cost is internal: even if the descriptive names are only our own (which is not the case here, IIUC):
And I like
Masterpiece, nothing to add, you killed me 🤣 |
This PR lifts the requirement that models built with the structural sub-module of PyMC be univariate. It's a chonky PR, so I split it into commits. Most of the files changes are changed by the first commit, which is just reorganization of files. It is safe to ignore that one.
Here are the steps I followed:
Reorganize structural model modlue commit
Allow combination of component with different numbers of observed states
PR. I am confident this code can be improved.For now, we assume all states in a component follow the same parameterization. It's now also valid to add together the same component twice with different states to work around this (e.g.
AutoRegressive(order=1, observed_state_names=['data_1']) + Autoregressive(order=5, observed_state_names=['data_2'])
) would be a valid model with 2 observed states, but each has it's own autoregressive dynamics.When you pass a batch of observed_state_names, e.g.
LevelTrend(order=2, observed_state_names=['data_1', 'data_2'])
, the parameters will all be given a batch dimension, but will otherwise be the same as the base case.More docs coming, but I tried obsessively document what in there so far.
The logic for extending the components is pretty straight-forward -- mostly copying + block_diag or concat, but there are some corner cases that need attention.
This PR should be seen as a companion to #450. Instead of vectorizing across the computation of a model, we're concatenating models. There will be cases where this is superior -- for example when you want to explicitly model latent interactions between components. But in other cases, this approach will be worse. I am interested in having both.